2 research outputs found

    Interface Development for Digitization of Documents Using OCR

    The purpose of this thesis is to develop a semi-automated interface that uses Optical Character Recognition (OCR) routines to identify text-based information from a large volume of digitized drawings associated with the oil and gas industry. The identified information is presented in an appropriate interface for any necessary manual modification, with the aim of improving the efficiency of maintaining large amounts of older documents. The thesis outlines the design of the interface and the implementation of the Tesseract OCR engine, in combination with tailor-made functions and classes that leverage OpenCV to enhance the recognition process.

    Interface Development for Digitization of Documents Using OCR

    The purpose of this thesis is to develop a semi-automated interface that uses Optical Character Recognition (OCR) routines to identify text-based information from a large volume of digitized drawings associated with the oil and gas industry. The identified information is presented in an appropriate interface for any necessary manual modification, with the aim of improving the efficiency of maintaining large amounts of older documents. The thesis outlines the design of the interface and the implementation of the Tesseract OCR engine, in combination with tailor-made functions and classes that leverage OpenCV to enhance the recognition process.